XLM-K: Improving Cross-Lingual Language Model Pre-training with Multilingual Knowledge
Authors
Abstract
Cross-lingual pre-training has achieved great successes using monolingual and bilingual plain text corpora. However, most pre-trained models neglect multilingual knowledge, which is language agnostic but comprises abundant cross-lingual structure alignment. In this paper, we propose XLM-K, a cross-lingual language model incorporating multilingual knowledge in pre-training. XLM-K augments existing multilingual pre-training with two knowledge tasks, namely the Masked Entity Prediction Task and the Object Entailment Task. We evaluate XLM-K on MLQA, NER and XNLI. Experimental results clearly demonstrate significant improvements over existing multilingual language models. The results on MLQA and NER exhibit the superiority of XLM-K in knowledge-related tasks. The success in XNLI shows the better cross-lingual transferability obtained by XLM-K. What is more, we provide a detailed probing analysis to confirm the desired knowledge captured in our pre-training regimen. The code is available at https://github.com/microsoft/Unicoder/tree/master/pretraining/xlmk.
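The Masked Entity Prediction task described above amounts to classifying each masked mention over an entity vocabulary from its contextual representation. The sketch below is only illustrative: the function name, shapes, and the use of a plain dot-product scorer are assumptions, not the paper's exact implementation.

```python
import numpy as np

def masked_entity_loss(hidden, entity_emb, labels):
    """Cross-entropy over an entity vocabulary at masked mention positions.

    hidden:     (n_masked, d) contextual vectors for the masked mentions
    entity_emb: (n_entities, d) entity embedding table
    labels:     (n_masked,) gold entity ids
    """
    logits = hidden @ entity_emb.T                       # (n_masked, n_entities)
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # negative log-likelihood of the gold entity, averaged over mentions
    return -log_probs[np.arange(len(labels)), labels].mean()

rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 16))        # 4 masked mentions, hidden size 16
entity_emb = rng.normal(size=(100, 16))  # toy entity vocabulary of 100
labels = np.array([3, 17, 42, 99])
loss = masked_entity_loss(hidden, entity_emb, labels)
```

In a real pre-training setup the loss would be computed on transformer outputs and backpropagated; here numpy is used only to make the objective concrete.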
Similar resources
Cross-lingual thesaurus for multilingual knowledge management
The Web is a universal repository of human knowledge and culture which has allowed unprecedented sharing of ideas and information in a scale never seen before. It can also be considered as a universal digital library interconnecting digital libraries in multiple domains and languages. Beside the advance of information technology, the global economy has also accelerated the development of inter-...
Multilingual Knowledge Graph Embeddings for Cross-lingual Knowledge Alignment
Many recent works have demonstrated the benefits of knowledge graph embeddings in completing monolingual knowledge graphs. Inasmuch as related knowledge bases are built in several different languages, achieving cross-lingual knowledge alignment will help people in constructing a coherent knowledge base, and assist machines in dealing with different expressions of entity relationships across div...
Learning Multilingual Subjective Language via Cross-Lingual Projections
This paper explores methods for generating subjectivity analysis resources in a new language by leveraging on the tools and resources available in English. Given a bridge between English and the selected target language (e.g., a bilingual dictionary or a parallel corpus), the methods can be used to rapidly create tools for subjectivity analysis in the new language.
Improving Neural Knowledge Base Completion with Cross-Lingual Projections
In this paper we present a cross-lingual extension of a neural tensor network model for knowledge base completion. We exploit multilingual synsets from BabelNet to translate English triples to other languages and then augment the reference knowledge base with cross-lingual triples. We project monolingual embeddings of different languages to a shared multilingual space and use them for network i...
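The projection of monolingual embeddings into a shared space mentioned above is commonly realized as an orthogonal (Procrustes) map learned from aligned word pairs. A minimal sketch, assuming two already-aligned embedding matrices (row i of each matrix is a translation pair); this is a generic illustration, not the paper's exact procedure:

```python
import numpy as np

def procrustes_projection(X, Y):
    """Orthogonal W minimizing ||X W - Y||_F for row-aligned embeddings.

    Closed-form solution: W = U V^T, where X^T Y = U S V^T (SVD).
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(1)
src = rng.normal(size=(50, 8))                 # 50 source-language vectors
true_w, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # a hidden orthogonal map
tgt = src @ true_w                             # target vectors related by it
W = procrustes_projection(src, tgt)            # recovers the map from pairs
```

Restricting the map to be orthogonal preserves distances and dot products in the source space, which is why it is a standard choice for cross-lingual embedding alignment.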
Applying Wikipedia's Multilingual Knowledge to Cross-Lingual Question Answering
The application of the multilingual knowledge encoded in Wikipedia to an open–domain Cross–Lingual Question Answering system based on the Inter Lingual Index (ILI) module of EuroWordNet is proposed and evaluated. This strategy overcomes the problems due to ILI’s low coverage on proper nouns (Named Entities). Moreover, as these are open class words (highly changing), using a community–based up– ...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2022
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v36i10.21330